issue 1111 merge new stac #412

Merged
merged 4 commits into from
Jul 10, 2025
ElienVandermaesenVITO
Contributor

@@ -67,7 +67,7 @@ def test_merge_from_disk_new(tmp_path):
 for asset_key, asset in item.get_assets().items()
 }
 assert asset_workspace_uris == {
- "asset.tif": f"file:{workspace.root_directory / 'path' / 'to' / 'collection.json_items' / 'asset.tif'}"
+ "asset.tif": f"file:{workspace.root_directory / 'path' / 'to' / 'collection.json_items' / 'asset.tif' / 'asset.tif'}"
Contributor

Is there a reason the filename now appears twice in the path?

Contributor Author

The path of the asset is the collection path + item ID + asset filename. In this test the item ID is asset.tif and the asset filename is also asset.tif. The merge now takes the path relative to the collection path, i.e. item ID + asset filename, so asset.tif appears twice. See:

asset_path = root_path / item.id / asset_filename
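
To make the composition concrete, here is a minimal sketch of how that line yields the doubled filename. The directory value is a made-up stand-in for the test's workspace directory, and the variable names other than `asset_path`, `root_path` and `asset_filename` are assumptions for illustration only:

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the directory that holds the collection's items.
root_path = PurePosixPath("/workspace/path/to/collection.json_items")

# In the test fixture, the item ID and the asset filename happen to be identical.
item_id = "asset.tif"
asset_filename = "asset.tif"

# Same composition as the line quoted above (item.id replaced by a plain string).
asset_path = root_path / item_id / asset_filename

# The part relative to the collection is item ID + asset filename,
# which is why "asset.tif" shows up twice.
relative_to_collection = asset_path.relative_to(root_path)
```

With a different item ID (say `item-1`), the relative part would read `item-1/asset.tif`, which makes the role of each segment easier to see than in the fixture.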

Collaborator

I'll look into it further but I'm not sure about this.

The problem with the current workspace implementations is that they assume that the asset key contains a relative path (relative to the job directory).

With the introduction of unified asset keys, the keys no longer carry any meaning (they could be anything), so this assumption no longer holds. The shortcut in the code has to be removed and implemented in a different way.

It's true that the dummy STAC items in these tests mirror the current implementation in that their item ID == asset key == relative asset path. That is a bit confusing with the new implementation in mind, but the actual item ID should not matter.

I would expect the workspace URI (the URI an asset gets when it is exported to a workspace) to remain the same, /path/to/collection.json_items/asset.tif, with the item ID not involved.
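
One way to picture that expectation is the sketch below. It is not the actual workspace implementation: the helper name, its parameters and the idea of deriving the relative path from the asset's href are all assumptions made for illustration. The point is only that neither the asset key nor the item ID takes part in computing the workspace URI:

```python
from pathlib import PurePosixPath


def expected_workspace_uri(workspace_root: str, merge: str, asset_href: str, job_dir: str) -> str:
    """Hypothetical helper: build the workspace URI of an exported asset.

    The relative path comes from where the asset file sits inside the job
    directory (an assumption), not from its asset key or its item ID.
    """
    relative_asset_path = PurePosixPath(asset_href).relative_to(PurePosixPath(job_dir))
    target = PurePosixPath(workspace_root) / f"{merge}_items" / relative_asset_path
    return f"file:{target}"


# Made-up inputs: the job directory and workspace root are placeholders.
uri = expected_workspace_uri(
    workspace_root="/workspace",
    merge="path/to/collection.json",
    asset_href="/jobs/j-1/asset.tif",
    job_dir="/jobs/j-1",
)
```

Under these assumptions the result ends in `path/to/collection.json_items/asset.tif` regardless of what the item is called or which key the asset is stored under.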

Collaborator

@bossie bossie left a comment

Mostly fine, but it didn't fully support the asset_per_band format option. That only became apparent in openeo-geopyspark-driver's test_batch_result.test_export_workspace_merge_filepath_per_band, so I fixed it.

I also made it backwards compatible with the old (non-unified asset keys) implementation. That lets us keep running the tests for the old implementation (important because this feature is still behind a feature flag) without also checking the feature flag here.

@bossie bossie merged commit de52b92 into master Jul 10, 2025
1 check passed
@bossie bossie deleted the issue1111-export-workspace branch July 10, 2025 14:23
Development

Successfully merging this pull request may close these issues.

unify asset keys across STAC items